Coronary artery disease (CAD) is the leading cause of death worldwide. Affected individuals cluster in
families in patterns that reflect the sharing of numerous susceptibility genes. Genome-wide and large-scale
gene-centric genotyping studies that involve tens of thousands of cases and controls have now mapped common
disease variants to 34 distinct loci. Some coronary disease common variants show allelic heterogeneity or
copy number variation. Some of the loci include candidate genes that imply conventional or emerging risk
factor-mediated mechanisms of disease pathogenesis. Quantitative trait loci associations with risk factors
have been informative in Mendelian randomization studies as well as fine-mapping of causative variants.
For most loci, however, plausible mechanistic links are uncertain or obscure at present, although they
provide potentially novel directions for research into this disease's pathogenesis. The common variants
explain ~4% of inter-individual variation in disease risk and no more than 13% of the total heritability of
coronary disease. Although many CAD genes are presently undiscovered, it is likely that larger collaborative
genome-wide association studies will map further common/low-penetrance variants, and it is hoped that
low-frequency or rare high-penetrance variants will also be identified in medical resequencing experiments.



INTRODUCTION 

Coronary artery disease (CAD) is the most frequent cause of 
death in high-income countries and the second most 
common cause of death in medium and low-income countries 
(1). It most commonly presents clinically in cases of angina 
pectoris and myocardial infarction (heart attack), which are 
due to atherosclerotic plaques that develop progressively as 
we age and occasionally rupture. Genetic epidemiological 
studies of family history and twin concordance studies are 
consistent with an underlying multifactorial model of disease 
susceptibility with a significant polygenic component. This 
is complemented by genetic analysis of heritable conventional 
risk factors such as low-density lipoprotein (LDL)-cholesterol 
and systolic blood pressure, which collectively might explain a 
minor portion of coronary disease risk (2). 

Over the past 4 years, researchers have completed several
genome-wide association studies (GWASs) to map underlying
common susceptibility variants for coronary disease. In parallel
with GWASs of other complex diseases, it was soon
apparent that typical effect sizes for individual
single-nucleotide polymorphisms (SNPs) were fairly small, so large
sample sizes would be required for reliable gene mapping.
This has encouraged collaboration between individual research
groups and led to the formation of consortia to pool
the results of GWASs using meta-analysis techniques. Progress
has been facilitated by the availability of phased haplotype
training sets (notably, from the HapMap project:
http://hapmap.ncbi.nlm.nih.gov/downloads/phasing) and the
accompanying genotype imputation software (for example,
MACH: www.sph.umich.edu/csg/abecasis/MACH/index.html
or IMPUTE: mathgen.stats.ox.ac.uk/impute/impute_v2.html).
These population genetic resources and statistical genetic
tools provide an efficient solution to the fact that individual
GWASs are often carried out on different SNP arrays with
variable SNP overlap.

All this effort came to a crescendo this year with the publication
of two papers from the CARDIoGRAM (3) and C4D (4)
consortia that together scanned nearly 40K coronary disease
cases for susceptibility gene signals. Combined with the
results from other recent large-scale studies (5-7), 35
common coronary disease variants have been robustly
mapped by GWASs (3-7) or gene-centric SNP arrays (8)
(Table 1). These results are mostly based on cases and controls
of European descent. The C4D (4) and gene-centric (8) discovery
experiments included South Asians from Pakistan
and India as well as Europeans; these designs were optimally
powered to detect variants that were common to both ancestry
groups. Wang et al. (7) carried out their GWAS discovery and
replication experiments in the Chinese Han population.

Table 1. Thirty-five common susceptibility variants for coronary artery disease
Chr, chromosome; Position, position (in bp) on GRCh37/hg19 (Genome Reference Consortium February 2009); EAF, effect allele frequency; OR, odds ratio;
$h^2_{SNP}$, SNP-specific heritability estimates are shown for three disease prevalence estimates; $K_p$, disease prevalence estimate for SNP-specific heritability estimate;
$h^2_{total}$, total SNP-encoded heritability for each disease prevalence estimate are shown in bold type.
a Most locus assignments are provisional based on proximity (see text).
b Effect allele frequency and odds ratio are given for Chinese Han population.

For some of the SNPs, there is circumstantial evidence to
highlight an underlying gene. For example, LIPA encodes
lipase A, which catalyzes the hydrolysis of cholesteryl esters
and triglycerides. The lead CAD risk SNP in LIPA (4,8) is
strongly correlated with an expression quantitative trait locus
(eQTL) SNP, with the CAD risk allele correlated with
increased expression of LIPA mRNA in monocytes (9) and
liver (4), suggesting a functional relationship between the
disease association signal and this candidate gene. However,
for most of the SNPs mapped by GWASs, it is difficult to
implicate an underlying gene. Inspection of recombination
frequency maps derived from the HapMap project suggests
genetic boundaries defined by recombination hotspots, which
drives the traditional approach of fine-mapping using haplotype
block information (10). However, there is ample evidence
that cis-regulatory mechanisms in the soma can operate over
tens, hundreds or even millions of base pairs and are presumably
unaffected by meiotic recombination (11). It is becoming
possible to define functional genomic boundaries based on
chromatin architecture-related factors, such as CCCTC-binding
factor sites (12) that are found in the vast majority
of vertebrate insulator elements. We suspect that, for most
coronary disease loci, it will take some time to complete the
sequence of functional genomics experiments that will be
required to confidently lay the blame of disease susceptibility
on a specific underlying gene. The loci shown in Table 1
should therefore be interpreted as provisional assignments
based mainly on proximity (nearest coding sequence).

It is noteworthy that the lists of confidently assigned genes 
across the recent crop of studies show little overlap, which at 
first glance seems unlikely to be due to differences in tagging 
SNP coverage, phenotypic heterogeneity or ancestry (these 
studies are heavily weighted towards European ancestry). 
We suspect that it most likely has its roots in the small 
effects conferred by the susceptibility genes (each allele 
increases risk by ~5%). Such small gene effects would, in a 
GWAS discovery experiment comprising 20K cases and 
20K controls, be expected to have only a 5% probability of 
passing even a modest stage-1 (tentative discovery) threshold
of P < 0.00001. Therefore, in absolute terms, this low power 
to detect an individual gene means that different, and only 
occasionally overlapping, sets of susceptibility loci are likely 
to emerge from similarly sized studies (13,14). Of course, 
even for loci that have surpassed the de facto genome-wide 
significance threshold of 5 × 10⁻⁸, there is still an appreciable
chance that one or more of the 34 loci will prove to be
false-positives (15). We expect based on binomial theory that the
maximum number of false-positive associations in Table 1 
will be three or fewer. Skol et al. (16) pointed out that a combined
analysis of stage 1 discovery data (based on GWAS SNP
arrays) and stage 2 validation data (a subset of the most promising
SNPs from stage 1) is more efficient than attempting
formal replication in stage 2 and such an analysis is now
standard practice. However, some researchers, mindful of
historical difficulties of interpreting complex genetic data 
(17), prefer to apply a cautious approach to secure robust 
(i.e. taking into account stage 2 multiple testing penalties) 
and independent (of GWAS discovery) replication data (3,4). 
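
The power argument in this paragraph can be checked with a back-of-envelope calculation. The following is a minimal sketch, not the calculation used in the cited references: it assumes a two-sided allelic test on the log odds ratio scale with a normal approximation, and it assumes an effect allele frequency (0.2 in the example), which the text does not specify. The function name and parameters are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def gwas_power(odds_ratio, allele_freq, n_cases, n_controls, alpha):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    norm = NormalDist()
    # Each individual contributes two alleles to the allele-count table.
    case_alleles = 2 * n_cases
    control_alleles = 2 * n_controls
    # Large-sample variance of the log odds ratio, using a single allele
    # frequency for both groups (reasonable when the odds ratio is near 1).
    variance = (1 / case_alleles + 1 / control_alleles) / (
        allele_freq * (1 - allele_freq)
    )
    z_effect = abs(log(odds_ratio)) / sqrt(variance)
    z_critical = norm.inv_cdf(1 - alpha / 2)
    # Probability of exceeding the critical value; the negligible
    # contribution from the opposite tail is ignored.
    return 1 - norm.cdf(z_critical - z_effect)


# Per-allele odds ratio 1.05, 20K cases and 20K controls, P < 0.00001.
for freq in (0.1, 0.2, 0.3, 0.5):
    print(freq, round(gwas_power(1.05, freq, 20_000, 20_000, 1e-5), 3))
```

Under these assumptions the power is of the order of a few percent for an allele frequency near 0.2 and rises only modestly for more common alleles, which is in line with the low detection probability described above.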

GWAS is a 'hypothesis-free' approach to the study of
complex diseases, which depends on mutations that occurred
unpredictably in previous generations and that, purely by chance
(genetic drift) or by (balancing) selection, are now common.
As such the loci that are identified provide a framework for 
what we know and do not know about pathogenesis from 
other hypothesis-led (e.g. physiological, biochemical or cell 
biological) experiments. In the case of coronary disease genet- 
ics, we have examples that illustrate complex genetic architec- 
tural features such as allelic heterogeneity, pleiotropy, risk 
factor QTLs, copy number variation (CNV) or synthetic asso- 
ciations, which are discussed below. There are findings that 
could lead to tangible clinical benefits relatively soon (LPA
and SORT1). But mainly there are leads to point researchers 
in unanticipated directions, at least some of which we hope 
will provide novel biological insights into how atherosclerotic 
plaques develop and rupture. 



HOW MUCH HERITABILITY HAS BEEN MAPPED? 

Manolio et al. (18) have pointed out that despite the success of
GWAS in mapping common susceptibility variants for many
multifactorial diseases, collectively these variants typically
explain a modest fraction of the total heritability of these
conditions. The accuracy of such calculations depends on the
fidelity of a series of locus-specific heritability estimates as
well as the total (i.e. measured plus unmeasured) heritability.

Locus-specific heritability estimates are based on odds ratio 
estimates that assume that the lead SNP at a locus accurately 
tags the disease-causing variant. No correction is usually made 
for potential biases due to the 'winner's curse' (19) or to signal 
attenuation due to clinically unscreened control data (20). 
External (to the case-control data) epidemiological information
on disease prevalence is required if the multifactorial
threshold model is to be used to calculate locus-specific
heritabilities. Prevalence estimates can vary substantially with
clinical phenotype, sex, and age and are drifting over time with
changing environmental risk factor exposures
(www.heartstats.org); we suggest that a range of prevalences between 2
and 10% is relevant to coronary disease GWAS case series.
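
To make the dependence on prevalence concrete, the sketch below shows one common way of converting a per-allele risk estimate into a locus-specific heritability on the liability scale. It is an illustration under stated assumptions rather than the exact procedure used for Table 1: genotypes are taken to be in Hardy-Weinberg proportions, the odds ratio is treated as a multiplicative per-allele relative risk, and residual liability within each genotype class is taken to be standard normal. The function name and the example values (allele frequency 0.5, relative risk 1.10) are hypothetical.

```python
from statistics import NormalDist


def snp_liability_h2(risk_allele_freq, relative_risk, prevalence):
    """Liability-scale variance explained by one biallelic SNP."""
    norm = NormalDist()
    p = risk_allele_freq
    # Hardy-Weinberg genotype frequencies for 0, 1 and 2 risk alleles.
    geno_freq = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]
    rel = [1.0, relative_risk, relative_risk ** 2]
    # Baseline penetrance chosen so that the genotype-weighted average
    # penetrance equals the population prevalence.
    baseline = prevalence / sum(g * r for g, r in zip(geno_freq, rel))
    penetrance = [baseline * r for r in rel]
    # Threshold model: each genotype shifts mean liability relative to the
    # reference genotype, with unit residual variance.
    threshold = norm.inv_cdf(1 - penetrance[0])
    means = [threshold - norm.inv_cdf(1 - f) for f in penetrance]
    overall = sum(g * m for g, m in zip(geno_freq, means))
    var_snp = sum(g * (m - overall) ** 2 for g, m in zip(geno_freq, means))
    return var_snp / (1 + var_snp)


# The same SNP evaluated across the 2-10% prevalence range suggested above.
for prevalence in (0.02, 0.05, 0.10):
    print(prevalence, round(snp_liability_h2(0.5, 1.10, prevalence), 5))
```

Each common variant of this size accounts for roughly a tenth of a percent of liability variance in this sketch, so summing over a few dozen such loci gives a total of a few percent.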

Invaluable coronary disease heritability data are derived 
from the longitudinal study of over 20K twins in the 
Swedish Twin Registry (Karolinska Institute, Stockholm, 
Sweden). The total heritability of coronary disease was esti- 
mated for angina in men as 39% (95% CI 29-49%) and in 
women as 43% (8-51%) (21) and for death from coronary 
disease in men as 57% (45-69%) and in women as 38% 
(26-50%) (22). Genetic association studies generally use 
samples collected from survivors of disease, for ethical and 
other pragmatic reasons. For coronary disease, case series 
are clinically heterogeneous if they include different diagnos- 
tic subgroups (e.g. chronic stable angina or myocardial infarc- 
tion) with subtly differing pathologies that might have an 
impact on susceptibility. So, we propose that a heritability 
estimate of 40% will encompass the clinical heterogeneity 
across typical GWAS case series. 

Taking all of these issues into account, we estimate that 
between 8 and 13% of the total heritability of coronary 
disease can be explained by the 35 common variants 
(Table 1). So, the vast majority of the heritability is currently un- 
explained. X-linked single-gene disorders have intrinsic advan- 
tages for gene mapping, so it is a pity that the X-chromosome has 
sometimes been overlooked in the search for coronary disease 
susceptibility loci (contrast the CARDIoGRAM study which
was based on imputed data with C4D which was based solely 
on genotype data). This can easily be resolved as appropriate 
analytic means (i.e. phased haplotype training sets and 
imputation software) are now freely available and have 
proved productive in the investigation of the X chromosome 
in other disease areas [e.g. type 2 diabetes (23)]. 

Although there was little evidence for non-additive genetic 
effects in the aforementioned twin studies of coronary 
disease, we note that classic MZ/DZ concordance studies 
have very limited power to identify non-additive variance 
components (24). Moreover, dominance and epistatic vari- 
ance components are inevitably confounded in this design 
(25). Consequently, there seems no reason why epistatic 
and high-penetrance/low-frequency alleles should not 
explain a portion of the missing heritability. Indeed, some 
of the linkages detected in earlier affected-sib-pair studies 
(26,27) might be conferred by low-frequency alleles with 
or without allelic heterogeneity. For example, the locus on 
chromosome 17p reported by PROCARDIS (27) was associated
with a sibling recurrence risk ratio (λsib) of 1.29 that
could theoretically be conferred by a dominant, low- 



Human Molecular Genetics, 2011, Vol. 20, Review Issue 2 R201 



frequency (0.5%) allele of intermediate penetrance (17.2%) 
and with a 1.8% phenocopy rate. Such linkage signals are 
usually intractable to conventional GWAS based on 
common SNPs (28) but might be resolved by means of 
resequencing-based analyses or genotyping arrays with 
good coverage of low-frequency variants. 
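
The relationship between a single dominant locus and the sibling recurrence risk ratio can be sketched as follows. This is a minimal illustration using the classical single-locus variance decomposition (sibling covariance in affection status equal to half the additive plus a quarter of the dominance variance), assuming random mating and Hardy-Weinberg proportions; the function and parameter names are illustrative and the original calculation may have differed in detail.

```python
def sibling_recurrence_ratio(allele_freq, penetrance_carrier, phenocopy_rate):
    """Sibling recurrence risk ratio for a fully dominant single locus."""
    q = allele_freq          # frequency of the risk allele
    p = 1 - q                # frequency of the alternative allele
    carrier_freq = 1 - p ** 2
    prevalence = (
        carrier_freq * penetrance_carrier
        + (1 - carrier_freq) * phenocopy_rate
    )
    # Genotypic values on the penetrance scale: half the difference between
    # the homozygotes (a) and the heterozygote deviation (d).
    a = (penetrance_carrier - phenocopy_rate) / 2
    d = a                    # complete dominance of the risk allele
    alpha = a + d * (p - q)  # average effect of the risk allele
    var_additive = 2 * p * q * alpha ** 2
    var_dominance = (2 * p * q * d) ** 2
    sib_covariance = var_additive / 2 + var_dominance / 4
    return 1 + sib_covariance / prevalence ** 2


# Allele frequency 0.5%, penetrance 17.2%, phenocopy rate 1.8%.
print(round(sibling_recurrence_ratio(0.005, 0.172, 0.018), 2))
```

With the parameter values quoted in the text, this sketch gives a ratio of about 1.3, close to the reported value; any small difference presumably reflects details of the original model that are not stated here.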



SUSCEPTIBILITY ENCODED BY CNV 

The development of array-based methods to systematically 
study CNV has allowed researchers to study the role of this 
rich source of genetic variation in common multifactorial 
diseases (29). It is ironic that these high-throughput techniques 
with genome-wide coverage of CNVs have overlooked an 
exemplar of common disease susceptibility namely that 
encoded by the apolipoprotein(a) (LPA) gene. This gene 
includes a highly variable number of kringle IV-2 repeats 
(range at least 12-44) which result in numerous isoforms 
that can be typed by protein electrophoresis (30) or genomi- 
cally quantified by qPCR (30-32). Two SNPs that tag short 
isoform alleles that are encoded by relatively low copy 
numbers of kringle IV-2 sequences show strong associations 
with high lipoprotein(a) levels and with coronary disease 
risk (OR ≈ 1.5) (30). These SNPs were not included on commonly
used GWAS SNP arrays (but were fortuitously included
in the design of the HumanCVD gene-centric SNP array)
and have been recalcitrant to genotype imputation due to
their frequency or linkage disequilibrium properties. Consequently,
they have not been assessed in GWAS meta-analyses
of coronary disease susceptibility.



ALLELIC HETEROGENEITY AND PLEIOTROPY 

Allelic heterogeneity is a regular feature of complex diseases
and traits. For example, multiple independent signals were
detected at 19 of 180 height QTL (33). Researchers have
systematically scanned for secondary association signals of coronary
disease (by conditioning on the lead SNP at each locus),
an approach that identified multiple independent SNP signals in
LPA (30). The coronary disease associations in PCSK9 detected
by the non-synonymous R46L SNP rs11591147 (34,35) and a
non-coding SNP rs11206510 (36) appear to be independent as
the two SNPs show little linkage disequilibrium (r² = 0.04 in
PROCARDIS HumanCVD data). A possible example of
allelic heterogeneity arises for SNPs rs3825807 (3) and
rs1994016 (6) that map to alternate flanks of the ADAMTS7
gene on chromosome 15 to rs4380028 (4) and are in moderate
linkage disequilibrium (r² = ~0.50).

Coronary disease shows substantial clinical heterogeneity
that is reflected in morphological differences in the atherosclerotic
plaques (37) that might in turn reflect differences in
inherited susceptibilities. Plaque rupture and subsequent coronary
thrombosis cause acute coronary syndromes such as
myocardial infarction, thereby motivating searches for genes
that might influence plaque stability. Reilly et al. (6) undertook
a GWAS of coronary disease patients with angiographic
disease contrasting those cases that had suffered myocardial
infarction with those who had not. This study mapped a
novel association to the ABO blood group system with
SNPs that strongly tag the O allele. The CARDIoGRAM consortium
(3), which studied a mixture of coronary disease cases
of which two-thirds had suffered a myocardial infarction and a
mixture of screened and unscreened controls, also mapped a
susceptibility signal to the ABO system. Their lead SNP is
only in moderate linkage disequilibrium (r² = 0.39) with the
Reilly et al. (6) signal so may be due to allelic heterogeneity
or pleiotropy. We must await further fine-mapping studies of
the ABO and ADAMTS7 loci to fully understand the details
of these associations.



QTL MAPPING AND CONVENTIONAL 
RISK FACTORS 

Conventional risk factors for coronary disease such as circulat- 
ing lipid levels and blood pressure are heritable (quantitative) 
traits. There has been much effort in mapping common QTL 
for these traits using GWAS or other large-scale SNP arrays 
in population-based as well as case-control samples (38-40).
Technical difficulties such as uncontrolled fasting status
or on-treatment measurements have been largely overcome 
to produce a rich crop of risk factor QTL. Notable examples 
of overlap with coronary disease loci (Table 2) include 
TRIB1 (Drosophila tribbles homologue, a gene that interacts 
with the mitogen-activated protein kinase cascade), which 
has pleiotropic effects on circulating triglyceride, LDL- and 
HDL-cholesterol levels (40) and CYP17A1 (17-α hydroxylase
gene involved in steroid hormone metabolism) which is a systolic
blood pressure QTL (38,39). However, most (22 of 34) of
the coronary disease loci do not show convincing risk factor 
QTL effects (Table 2). It may be that novel heritable inter- 
mediate phenotypes will eventually be identified that will 
explain some of the disease associations; this should lead to 
informative insights into pathological mechanisms. Indeed,
expectations that coronary disease genes would be involved in
innate immunity or thrombosis have not been borne out by
large-scale genetic association studies to date.

MOVING FROM ASSOCIATED LOCUS 
TO CAUSATIVE GENE 

Robust assignments of common susceptibility variants are to 
be applauded, but it may take some time to resolve the under- 
lying molecular genetic mechanisms. For instance, the first 
coronary disease locus to confidently emerge from GWAS 
mapped to chromosome 9p21 to a region that was initially 
believed to be a gene desert. However, it was quickly recognized
in fine-mapping studies (10) that the associated region,
which was of prior interest to cancer genetics researchers
(41,42), was potentially linked to neighbouring cyclin-dependent
kinase inhibitor genes CDKN2A (which has
multiple synonyms including p16, see
www.genecards.org/cgi-bin/carddisp.pl?gene=CDKN2A for more details) and
CDKN2B (p15, see
www.genecards.org/cgi-bin/carddisp.pl?gene=CDKN2B). Subsequent studies of murine models
have highlighted a Cdkn2a/b-mediated mechanism involving 
smooth muscle cell proliferation (43). Studies of human transcription
enhancer elements propose that the large non-coding
antisense RNA molecule CDKN2BAS, which is also known
by the monikers ANRIL and CDKN2B-AS1, is involved in the
long-range transcriptional regulation of several genes, including
CDKN2A and CDKN2B, in vascular endothelial cell lines
(44).

Table 2. Risk factor QTL and coronary artery disease

Locus           | Risk factor      | Lead QTL SNP | Lead risk SNP | r²    | Reference
                |                  |              | rs17465637    |       |
ABCG8           | LDL              | rs4299376    | rs4299376     | 1.00  | Teslovich et al. (40)
WDR12           |                  |              | rs6725887     |       |
MRAS            |                  |              | rs2306374     |       |
IL5             |                  |              | rs2706399     |       |
C6orf105        |                  |              | rs6903956*    |       |
PHACTR1         |                  |              | rs12526453    |       |
ANKS1A          |                  |              | rs17609940    |       |
TCF21           |                  |              | rs12190287    |       |
LPA             | Lp(a)            | rs10455872   | rs10455872    | 1.00  | Clarke et al. (30)
LPA             | Lp(a)            | rs3798220    | rs3798220     | 1.00  | Clarke et al. (30)
LPA             | LDL, TC          | rs1564348    | rs10455872    | <0.30 | Teslovich et al. (40)
LPA             | HDL              | rs1084651    | rs10455872    | NA b  | Teslovich et al. (40)
7q22            |                  |              | rs10953541    |       |
ZC3HC1          |                  |              | rs11556924    |       |
TRIB1           | TG, TC, LDL, HDL | rs2954029    | rs10808546    | 0.96  | Kathiresan et al. (64), Teslovich et al. (40)
ANRIL/CDKN2BAS  |                  |              | rs4977574     |       |
ABO             | LDL, TC          | rs9411489    | rs579459      | <0.30 | Teslovich et al. (40)
KIAA1462        |                  |              | rs2505083     |       |
CXCL12          |                  |              | rs1746048     |       |
LIPA            |                  |              | rs1412444     |       |
CYP17A1-NT5C2   | blood pressure   | rs11191548   | rs12413409    | 1.00  | Newton-Cheh et al. (38)
PDGFD           |                  |              | rs974819      |       |
APOA1-C3-A4-A5  | TG, HDL          | rs964184     | rs964184      | 1.00  | Kathiresan et al. (64), Teslovich et al. (40)
SH2B3           | blood pressure   | rs3184504    | rs3184504     | 1.00  | Levy et al. (39)
COL4A1-A2       |                  |              | rs4773144     |       |
HHIPL1          |                  |              | rs2895811     |       |
ADAMTS7         |                  |              | rs4380028     |       |
SMG6-SRR        |                  |              | rs216172      |       |
PEMT            |                  |              | rs12936587    |       |
GIP-ATP         |                  |              | rs46522       |       |
LDLR            | LDL, TC          | rs6511720    | rs1122608     | <0.30 | Teslovich et al. (40)
APOE            | LDL, TC, HDL     | rs4420638    | rs2075650     | 0.40  | Kathiresan et al. (64), Teslovich et al. (40)
MRPS6           |                  |              | rs9982601     |       |

r², measure of linkage disequilibrium between the lead QTL SNP and the lead risk SNP; LDL, LDL-cholesterol; HDL, HDL-cholesterol; TC, total cholesterol; TG,
triglycerides.
a Most locus assignments are provisional based on proximity (see text).
b SNP rs1084651 had a >5% genotyping failure rate in the HapMap database Rel23.

Another notable success story followed the overlap of
LDL-cholesterol QTL and coronary disease association
signals on chromosome 1p. Here eQTL were particularly informative
in distinguishing the role of SORT1, which encodes sortilin,
from its neighbours (45). Parallel functional studies have
implicated sortilin as a novel regulator of lipoprotein production
in the liver (46), thereby providing a mechanistic link to
the coronary disease susceptibility although some details of
the mechanism need to be reconciled.



DISCREPANCIES BETWEEN MEASURED 
AND PREDICTED GENETIC RISK 

Using results from proxy SNPs that are in almost complete
linkage disequilibrium with each other, the CARDIoGRAM
and C4D studies provide a joint SORT1 per-allele risk estimate
equal to 1.12 (1.09-1.15). The measured per-allele QTL effect
on LDL-cholesterol is equal to 0.145 mmol/l (0.135-0.155)
(40). Substituting the latter QTL effect into the Framingham
coronary heart disease risk equation (47) predicts a per-allele
relative risk equal to 1.042 (1.039-1.045). So, the coronary
disease risk estimate derived from GWAS is substantially
higher than that predicted from the effect on LDL-cholesterol
levels derived from long-term prospective studies.
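
The size of this discrepancy is easier to judge when both estimates are expressed per unit of LDL-cholesterol. The sketch below simply rescales the two per-allele estimates quoted above on the log scale, assuming a log-linear relationship between LDL-cholesterol and risk; the function and variable names are illustrative, and confidence intervals are ignored.

```python
from math import exp, log


def log_risk_per_unit(relative_risk, exposure_shift):
    """Log relative risk per unit of exposure, assuming log-linearity."""
    return log(relative_risk) / exposure_shift


LDL_SHIFT = 0.145  # per-allele effect on LDL-cholesterol, mmol/l (40)

# Genetic estimate: joint CARDIoGRAM and C4D per-allele odds ratio.
observed = log_risk_per_unit(1.12, LDL_SHIFT)
# Epidemiological estimate: Framingham-predicted per-allele relative risk.
predicted = log_risk_per_unit(1.042, LDL_SHIFT)

print("genetic estimate per 1 mmol/l:", round(exp(observed), 2))
print("Framingham estimate per 1 mmol/l:", round(exp(predicted), 2))
print("ratio of log effects:", round(observed / predicted, 2))
```

On this scale the genetic estimate is between two and three times the Framingham-based prediction, which is the gap that the following paragraph sets out to explain.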

A similar discrepancy was noted in a genotype risk score analysis
of the Malmö Diet and Cancer study (48). Given the underlying
sample sizes (>100K for the lipid QTL GWAS
experiment) and the consequent precision of the risk factor
effect sizes, it seems unlikely that these have been systematically
underestimated. It is possible that the disease risk estimates
have been overestimated [e.g. winner's curse (19)] and/or the
disease associations are only partially mediated through the
accompanying risk factor QTL effect (i.e. pleiotropy). Moreover,
the Framingham risk equations, which were based on US population
data from the 1970s onwards, were based on single (baseline)
cholesterol measurements. Cholesterol measurements are
subject to short-term (e.g. variable fasting) as well as long-term
variation (e.g. changes in diet). Such within-individual variation
can result in the systematic underestimation of the
strength of a risk factor association with disease, an epidemiological
effect known as regression dilution bias (49). Whatever
the explanation, these findings emphasize that genetic
epidemiological inferences from cross-sectional data need
cautious interpretation and that information from prospective
studies (e.g. UK BIOBANK, www.ukbiobank.ac.uk) will be
very informative.
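
Regression dilution can be illustrated with a small simulation. The sketch below is purely hypothetical: it uses arbitrary variances and a continuous outcome rather than real cholesterol or disease data, and shows that a slope estimated from a single noisy measurement is attenuated by the ratio of between-individual variance to total (between plus within) variance.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(true_slope=0.5, sd_between=1.0, sd_within=1.0, n=20_000, seed=1):
    """Return slopes fitted on the usual level and on a single measurement."""
    rng = random.Random(seed)
    # Long-term ("usual") risk factor level of each individual.
    usual = [rng.gauss(0, sd_between) for _ in range(n)]
    # One baseline measurement: usual level plus within-individual noise.
    single = [u + rng.gauss(0, sd_within) for u in usual]
    # The outcome depends on the usual level, not on the noisy measurement.
    outcome = [true_slope * u + rng.gauss(0, 1.0) for u in usual]
    return ols_slope(usual, outcome), ols_slope(single, outcome)


slope_usual, slope_single = simulate()
reliability = 1.0 ** 2 / (1.0 ** 2 + 1.0 ** 2)  # between / (between + within)
print(round(slope_usual, 2), round(slope_single, 2), reliability)
```

With equal between- and within-individual variances the reliability ratio is 0.5, so the slope on the single measurement is about half the slope on the usual level; dividing the attenuated slope by the reliability ratio recovers the underlying association.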



GENETIC INSIGHTS FOR EMERGING
RISK FACTORS

Common variants that show quantitative genetic variation for
disease risk factors or intermediate phenotypes can probe the
putative causal relationship between risk factor and disease.
This Mendelian randomization (MR) information (50) is particularly
useful when there are no drugs that specifically
modulate the exposure. For instance, Lp(a) is an LDL particle
that appears to be proatherogenic in cross-sectional and
prospective epidemiological studies. But drugs such as
niacin that reduce Lp(a) concentrations also beneficially increase
the HDL levels. So, it can be difficult in randomized
clinical trials (RCT) to unambiguously attribute any clinical
benefit to specific mechanisms. Following a large-scale candidate
gene study, two SNPs in the apolipoprotein(a) gene were
shown to tag short isoform alleles that were strikingly associated
with raised Lp(a) concentrations and coronary disease
risk (30) (also discussed in CNV and allelic heterogeneity sections
above). Simultaneous modelling of disease risk and
quantitative genetic variation was consistent with a direct
causal link, thus predicting that pharmacological lowering of
Lp(a) levels will be beneficial to patients. Loci that carry triglyceride
QTL such as TRIB1 and APOA1-C3-A4-A5 also
show QTL effects for HDL and LDL. Consequently, these
pleiotropic loci will not be useful for MR probing of the
role of triglyceride, a well-studied lipid that is presently not
routinely included in cardiovascular risk calculations.

CANDIDATE AND POSITIONALLY
CLONED GENES

Before the GWAS epoch, there was much effort expended in
scanning candidate genes, those genes with known or predicted
functions that might be involved in coronary disease
pathogenesis, for susceptibility variants. This research was
supplemented by positional cloning experiments that were unbiased
in terms of gene candidature (51). In comparison with
the levels of statistical support required for GWAS, the evidence
for most of the candidate genes was modest despite
meta-analyses involving up to 36K subjects (52). For instance,
genetic variation in the apolipoprotein E gene (APOE) has
well-known effects on LDL-cholesterol and is a highly
plausible coronary disease candidate gene. However, a
meta-analysis of 17 studies with at least 500 cases that
included 21 331 cases and 47 467 controls estimated the risk
of carrying the protective ε2 allele (versus ε3/ε3
homozygotes) as 0.80 (95% CI 0.70-0.90) (53); the significance
of this association is approximately P = 0.0005, four
orders of magnitude below genome-wide significance.
GWAS, even when enhanced by genotype imputation, may
not accurately tag specific candidate gene variants [e.g. LPA
and rs10455872 (30)]. So, the absence of a GWAS signal
cannot be assumed to negate prior candidate or positional
cloned genes (8). Indeed, the design of the HumanCVD gene-centric
SNP array has revealed several novel loci (LIPA, IL5,
TRIB1 and ABCG5/ABCG8) as well as finally robustly confirming
the candidature of APOE (8).
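
The approximate P-value quoted for the APOE meta-analysis can be recovered from the odds ratio and its confidence interval. The following is a minimal sketch that assumes the interval is symmetric on the log odds ratio scale and that the test statistic is normally distributed; the function name is illustrative.

```python
from math import log, log10
from statistics import NormalDist


def p_value_from_ci(odds_ratio, ci_lower, ci_upper, level=0.95):
    """Two-sided P-value implied by an odds ratio and its confidence interval."""
    norm = NormalDist()
    z_ci = norm.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    # The width of the interval on the log scale gives the standard error.
    standard_error = (log(ci_upper) - log(ci_lower)) / (2 * z_ci)
    z = log(odds_ratio) / standard_error
    return 2 * norm.cdf(-abs(z))


p_apoe = p_value_from_ci(0.80, 0.70, 0.90)
print(p_apoe)
# Distance from the genome-wide threshold of 5 × 10⁻⁸, in orders of magnitude.
print(round(log10(p_apoe / 5e-8), 1))
```

Under these assumptions the implied P-value is close to 0.0005, about four orders of magnitude short of the genome-wide significance threshold.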



FUTURE DIRECTIONS AND PROSPECTS 

Models of the genetic architecture of complex traits (33) 
predict that large numbers of small effect susceptibility loci 
remain to be discovered, some of which should be tractable 
to well-powered GWAS. The momentum amongst researchers 
to meta-analyse GWAS data will be sustained as larger 
consortia (e.g. the recently merged CARDIoGRAMplusC4D 
consortium) are formed. They can take full advantage of initiatives
such as the Metabochip project, a custom array containing
196 725 SNPs that builds on the CARDIoGRAM stage-1
discovery results. So, it is reasonable to expect that the
number of common coronary disease variants will increase,
although as each novel variant will be associated with increasingly
tiny effects, it seems that the missing heritability gap
will never be filled by common variants alone.

The compilation of an exhaustive list of human genetic
variation through the dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP)
and HapMap (http://hapmap.ncbi.nlm.nih.gov/)
projects is being systematically expanded by the 1000
Genomes project (http://www.1000genomes.org). This is particularly
useful for low-frequency variants (minor allele
frequency <5%), which were largely absent from the early
genome-wide SNP arrays, which were designed to type
common variants in GWAS. Synthetic associations due to
tagging of rare variants by common SNPs or haplotypes composed
of common SNPs can occasionally be detected (54);
this is particularly useful if the rare variant might disrupt
gene function (e.g. non-synonymous SNP). Consequently,
analyses of imputed genotypes derived from the 1000 
genomes project are an immediate research priority. This is 
encouraged by the detection of a haplotype association with 
coronary disease (55) that subsequently was partially 
explained as a synthetic association to an LPA SNP (30). 
Imputation-based analysis will be complemented by whole- 
genome or exome resequencing experiments aimed at identi- 
fying low-frequency variants and unique high-penetrance 
mutations; together, these approaches have the potential to 
reveal novel disease mechanisms. 

Finally, as the list of mapped disease variants expands, and 
fine-mapping and functional genomic studies refine loci to 
resolve underlying genes, pathway and network analysis 
should prove useful to provide systems level insights into 
coronary disease pathogenesis (56,57). 